Conversation

@merykitty
Member

@merykitty merykitty commented Oct 3, 2025

Hi,

This PR improves the implementation of AndNode/OrNode/XorNode::Value by taking advantage of the additional information in TypeInt. The implementation is pretty straightforward. One trick is that by analyzing the negative and non-negative ranges of a TypeInt separately, we get better information about the leading bits. I have also implemented gtest unit tests to verify the correctness and monotonicity of the inference functions.
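
To illustrate the interval-splitting remark, here is a minimal sketch. It is not the PR's code: `Known` and `known_prefix` are hypothetical names, and 32-bit values are assumed. For the signed range [-4, 3], the negative half and the non-negative half each have 30 known leading bits, while the smallest unsigned interval covering both halves has none.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for known-bits information (not HotSpot's KnownBits class).
struct Known {
  uint32_t zeros;  // bits known to be 0
  uint32_t ones;   // bits known to be 1
};

// Every value in the unsigned interval [ulo, uhi] shares the common binary
// prefix of ulo and uhi, so those leading bits are known.
inline Known known_prefix(uint32_t ulo, uint32_t uhi) {
  assert(ulo <= uhi);
  uint32_t low = ulo ^ uhi;  // highest set bit: first position where the bounds differ
  low |= low >> 1;
  low |= low >> 2;
  low |= low >> 4;
  low |= low >> 8;
  low |= low >> 16;          // now covers every bit at or below that position
  uint32_t prefix = ~low;    // the bits on which ulo and uhi agree from the top
  return Known{~ulo & prefix, ulo & prefix};
}
```

With this sketch, `known_prefix(0xFFFFFFFCu, 0xFFFFFFFFu)` (the half [-4, -1]) reports the top 30 bits as ones, and `known_prefix(0u, 3u)` reports them as zeros, so each of the four half-by-half combinations of two such inputs can be analyzed with full leading-bit information.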

Please take a look and leave your reviews, thanks a lot.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8367341: C2: apply KnownBits and unsigned bounds to And / Or operations (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27618/head:pull/27618
$ git checkout pull/27618

Update a local copy of the PR:
$ git checkout pull/27618
$ git pull https://git.openjdk.org/jdk.git pull/27618/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 27618

View PR using the GUI difftool:
$ git pr show -t 27618

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27618.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
bridgekeeper bot commented Oct 3, 2025

👋 Welcome back qamai! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
openjdk bot commented Oct 3, 2025

@merykitty This change is no longer ready for integration - check the PR body for details.

@openjdk
openjdk bot commented Oct 3, 2025

@merykitty The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the rfr Pull request is ready for review label Oct 3, 2025
@mlbridge
mlbridge bot commented Oct 3, 2025

Webrevs

Member

@SirYwell SirYwell left a comment

Nice change overall.
I'm not sure how "easily" we can really see the benefit in the example of the interval splitting, but I leave that to others to judge.

I was just wondering, do you think it makes sense to move more such code into the RangeInference classes in future (e.g., for shift ops) or how we'll tell what to place where. From what it looks like the main reason currently is to use the TypeIntMirror classes for testability, which other node types definitely could benefit from as well.

@merykitty
Member Author

I'm not sure how "easily" we can really see the benefit in the example of the interval splitting, but I leave that to others to judge.

Without it, the simple inference function fails AndLNodeIdealizationTest because the current version also splits the analysis between the negative part and the non-negative part.

I was just wondering, do you think it makes sense to move more such code into the RangeInference classes in future (e.g., for shift ops) or how we'll tell what to place where. From what it looks like the main reason currently is to use the TypeIntMirror classes for testability, which other node types definitely could benefit from as well.

Yes, that is exactly my intention. For example, we would only need to implement RangeInference::infer_left_shift, and the unit test could be as simple as:

class OpLeftShift;
class InferLeftShift;

TEST(opto, range_inference) {
  test_binary<OpLeftShift, InferLeftShift>();
}
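
As a rough illustration of what such a generic harness might check, here is a self-contained sketch over 4-bit unsigned intervals. Everything in it is an assumption for illustration: `Interval`, `OpAnd`, `InferAnd`, `check_pair`, and this `test_binary` are hypothetical stand-ins, not the PR's TypeIntMirror-based gtest code. It verifies correctness (every concrete result lies in the inferred interval) and monotonicity (narrowing an input never yields a result outside the original one).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a type: an unsigned interval [lo, hi] of 4-bit values.
struct Interval {
  uint8_t lo;
  uint8_t hi;
  bool contains(uint8_t v) const { return lo <= v && v <= hi; }
  bool contains(const Interval& o) const { return lo <= o.lo && o.hi <= hi; }
};

// Hypothetical operation/inference pair for bitwise AND.
struct OpAnd {
  static uint8_t apply(uint8_t a, uint8_t b) { return static_cast<uint8_t>(a & b); }
};
struct InferAnd {
  // x & y never exceeds x or y as an unsigned value, so [0, min(a.hi, b.hi)] is sound.
  static Interval infer(const Interval& a, const Interval& b) {
    assert(a.lo <= a.hi && b.lo <= b.hi);
    return Interval{0, a.hi < b.hi ? a.hi : b.hi};
  }
};

// Checks one pair of inputs: correctness, then monotonicity in each argument.
template <class Op, class Infer>
bool check_pair(const Interval& a, const Interval& b) {
  const Interval r = Infer::infer(a, b);
  // Correctness: every concrete result must lie in the inferred interval.
  for (int x = a.lo; x <= a.hi; x++) {
    for (int y = b.lo; y <= b.hi; y++) {
      if (!r.contains(Op::apply(static_cast<uint8_t>(x), static_cast<uint8_t>(y)))) return false;
    }
  }
  // Monotonicity: a narrower first input must give a result contained in r.
  for (int lo = a.lo; lo <= a.hi; lo++) {
    for (int hi = lo; hi <= a.hi; hi++) {
      const Interval sub{static_cast<uint8_t>(lo), static_cast<uint8_t>(hi)};
      if (!r.contains(Infer::infer(sub, b))) return false;
    }
  }
  // Monotonicity: the same for a narrower second input.
  for (int lo = b.lo; lo <= b.hi; lo++) {
    for (int hi = lo; hi <= b.hi; hi++) {
      const Interval sub{static_cast<uint8_t>(lo), static_cast<uint8_t>(hi)};
      if (!r.contains(Infer::infer(a, sub))) return false;
    }
  }
  return true;
}

// Runs check_pair over every pair of 4-bit intervals.
template <class Op, class Infer>
bool test_binary() {
  for (int alo = 0; alo < 16; alo++) {
    for (int ahi = alo; ahi < 16; ahi++) {
      for (int blo = 0; blo < 16; blo++) {
        for (int bhi = blo; bhi < 16; bhi++) {
          const Interval a{static_cast<uint8_t>(alo), static_cast<uint8_t>(ahi)};
          const Interval b{static_cast<uint8_t>(blo), static_cast<uint8_t>(bhi)};
          if (!check_pair<Op, Infer>(a, b)) return false;
        }
      }
    }
  }
  return true;
}
```

Under these assumptions, adding a new operation only requires a new `Op`/`Infer` pair; the exhaustive loops stay the same.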

@merykitty
Member Author

@eme64 I think it would be great if you take a look at this PR.

Contributor

@eme64 eme64 left a comment

@merykitty Thank you very much for working on this, very exciting. And it seems that the actual logic is now simpler than all the custom logic before!

However, we need to make sure that all cases that you are not deleting are indeed covered.

  1. OrINode::add_ring
  if ( r0 == TypeInt::BOOL ) {
    if ( r1 == TypeInt::ONE) {
      return TypeInt::ONE;
    } else if ( r1 == TypeInt::BOOL ) {
      return TypeInt::BOOL;
    }
  } else if ( r0 == TypeInt::ONE ) {
    if ( r1 == TypeInt::BOOL ) {
      return TypeInt::ONE;
    }
  }

That seems to be covered by KnownBits.

  2. OrINode::add_ring
  if (r0 == TypeInt::MINUS_1 || r1 == TypeInt::MINUS_1) {
    return TypeInt::MINUS_1;
  }

Also seems OK, handled by KnownBits.

  3. OrINode::add_ring
  // If either input is not a constant, just return all integers.
  if( !r0->is_con() || !r1->is_con() )
    return TypeInt::INT;        // Any integer, but still no symbols.

  // Otherwise just OR them bits.
  return TypeInt::make( r0->get_con() | r1->get_con() );

Constants would also be handled by KnownBits.

  4. xor_upper_bound_for_ranges
    I think this should also be handled by doing KnownBits first, and then inferring the signed/unsigned bounds, right?

  5. and_value
    Does not look so trivial. Maybe you can go over it step by step, and leave some GitHub code comments?
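
For the three OrINode::add_ring cases in the list above, the coverage claim can be checked with a small sketch. This is an illustration only: the `KnownBits` struct below is a hypothetical two-mask record whose layout is an assumption, not HotSpot's actual class, and the helper names are made up.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical known-bits record; the field layout is an assumption.
struct KnownBits {
  uint32_t zeros;  // bits known to be 0
  uint32_t ones;   // bits known to be 1
};

// A constant has every bit known.
inline KnownBits constant_bits(uint32_t c) { return KnownBits{~c, c}; }

// Nothing is known about any bit.
inline KnownBits unknown_bits() { return KnownBits{0, 0}; }

// TypeInt::BOOL is [0, 1]: every bit except the lowest is known to be 0.
inline KnownBits bool_bits() { return KnownBits{~uint32_t{1}, 0}; }

// A result bit of OR is 0 only if it is 0 in both inputs, and 1 if it is 1 in either.
inline KnownBits known_or(KnownBits a, KnownBits b) {
  assert((a.zeros & a.ones) == 0 && (b.zeros & b.ones) == 0);
  return KnownBits{a.zeros & b.zeros, a.ones | b.ones};
}

// True if k pins down exactly the constant c.
inline bool is_constant(KnownBits k, uint32_t c) {
  return k.zeros == ~c && k.ones == c;
}
```

In this sketch, BOOL | ONE yields the constant 1, BOOL | BOOL keeps only the lowest bit unknown, anything | MINUS_1 yields the constant -1, and two constants fold to their OR.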

Comment on lines +211 to +218
// These allow TypeIntMirror to mimick the behaviors of TypeInt* and TypeLong*, so they can be
// passed into RangeInference methods. These are only used in testing, so they are implemented in
// the test file.
const TypeIntMirror* operator->() const;
TypeIntMirror meet(const TypeIntMirror& o) const;
bool contains(U u) const;
bool contains(const TypeIntMirror& o) const;
bool operator==(const TypeIntMirror& o) const;
Contributor

Could we limit this to DEBUG_ONLY?

Member Author

Maybe, but that would disable these gtests in product builds. What do you think?

Contributor

Right, I see. I suppose we can keep it. Can you somehow make it clear which block it is, maybe with some start/end markers?

I was wondering if the method below is still part of it, but I don't think so. But it was unclear at first.

#include <array>
#include <limits>
#include <type_traits>
#include <unordered_set>
Contributor

I don't know the current state of code style guide: but are we allowed to use std::unordered_set?

Member Author

I can't think of a better way, we have HashTable but it is terrible since the table size is fixed.

Contributor

Not sure. I'll ask some folks who might know / have an answer / opinion ;)

Contributor

It surely would be very easy, and not affect the product. But let's see what they say.

Contributor

They tell me it is fine, and we are already doing similar things here:
test/hotspot/gtest/jfr/test_networkUtilization.cpp

@kimbarrett kimbarrett Oct 15, 2025

Not a review, just a drive-by comment, following up on @eme64 "They tell me its fine".

I do not think it's okay to use most standard library headers. Doing so can run into issues with things
like our forbidden function mechanism, assert macro collision, and others. My opinion is the uses in
jfr/test_networkUtilization.cpp shouldn't be there, and aren't actually necessary. I just did a spot check,
and the only "good" case I found is test_codestrings.cpp using <regex>, where there isn't any similar
functionality available in hotspot. The suggestion in the discussion @eme64 for a set is RBTree. The O(1)
lookup by a hashtable is unlikely to matter to a gtest.

There is ongoing work updating our usage (see, for example, https://bugs.openjdk.org/browse/JDK-8369186)
and how to do that in a safe and consistent manner.

Contributor

Use RBTreeCHeap, if going the RBTree route. It's just the easiest way of using it.

Member Author

Thanks for your input. I have removed the usage of std::unordered_map and replaced it with RBTreeCHeap. Is using std::array here fine?

Contributor

Hi @merykitty,

I think that we don't use the STL because we run without exceptions and because we want our production data structures to have custom allocators, and history :-). As std::array (AFAIU) is 'just' a typed and sized T*, I think it should be fine, as long as you avoid things that might throw!

@merykitty
Member Author

@eme64 Thanks for your review, I believe I have addressed all of your suggestions.

However, we need to make sure that all cases that you are not deleting are indeed covered.

For this, from the testing POV, all the current idealization tests pass.

From the theoretical POV, let me present it below:

For Xor:

return round_up_power_of_2(U(hi_0 | hi_1) + 1) - 1; // This should be trivially covered by `KnownBits`, since it tries to deal with the highest bits that are known to be 0 in both inputs
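
A small sketch of why this bound follows from leading zeros alone. It is an illustration under stated assumptions: both ranges are non-negative, and `xor_upper_bound` and `xor_bound_holds` are hypothetical names rather than HotSpot functions. The bound is the all-ones mask covering the highest set bit of `hi_0 | hi_1`; every bit above it is 0 in both inputs and therefore 0 in the XOR.

```cpp
#include <cassert>
#include <cstdint>

// Smallest all-ones mask covering every set bit of hi0 | hi1. For inputs below
// 2^31 this equals round_up_power_of_2((hi0 | hi1) + 1) - 1.
inline uint32_t xor_upper_bound(uint32_t hi0, uint32_t hi1) {
  assert(hi0 <= 0x7FFFFFFFu && hi1 <= 0x7FFFFFFFu);  // non-negative ranges only
  uint32_t m = hi0 | hi1;
  m |= m >> 1;
  m |= m >> 2;
  m |= m >> 4;
  m |= m >> 8;
  m |= m >> 16;
  return m;
}

// Exhaustively checks a ^ b <= bound for all 0 <= a <= hi0 < limit and 0 <= b <= hi1 < limit.
inline bool xor_bound_holds(uint32_t limit) {
  for (uint32_t hi0 = 0; hi0 < limit; hi0++) {
    for (uint32_t hi1 = 0; hi1 < limit; hi1++) {
      const uint32_t bound = xor_upper_bound(hi0, hi1);
      for (uint32_t a = 0; a <= hi0; a++) {
        for (uint32_t b = 0; b <= hi1; b++) {
          if ((a ^ b) > bound) return false;
        }
      }
    }
  }
  return true;
}
```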

For Or:

// If both args are bool, can figure out better types
if ( r0 == TypeInt::BOOL ) {
  if ( r1 == TypeInt::ONE) {
    return TypeInt::ONE; // Trivial, since all bits except the lowest are 0 in both inputs, and the lowest bit is 1 in the second input
  } else if ( r1 == TypeInt::BOOL ) {
    return TypeInt::BOOL; // Trivial, since all bits except the lowest are 0 in both inputs
  }
} else if ( r0 == TypeInt::ONE ) {
  if ( r1 == TypeInt::BOOL ) {
    return TypeInt::ONE; // Same as above
  }
}

if (r0 == TypeInt::MINUS_1 || r1 == TypeInt::MINUS_1) {
  return TypeInt::MINUS_1; // Trivial, since all bits are 1 in one of the inputs
}

// If either input is not a constant, just return all integers.
if( !r0->is_con() || !r1->is_con() )
  return TypeInt::INT;        // Any integer, but still no symbols.

// Otherwise just OR them bits.
return TypeInt::make( r0->get_con() | r1->get_con() ); // Constant folding is trivial

For And:

// If both types are constants, we can calculate a constant result.
if (r0->is_con() && r1->is_con()) {
  return IntegerType::make(r0->get_con() & r1->get_con()); // Constant folding is trivial
}

// If both ranges are positive, the result will range from 0 up to the hi value of the smaller range. The minimum
// of the two constrains the upper bound because any higher value in the other range will see all zeroes, so it will be masked out.
if (r0->_lo >= 0 && r1->_lo >= 0) {
  return IntegerType::make(0, MIN2(r0->_hi, r1->_hi), widen); // In this case, both have a single simple interval, and the max of the result (which is the same as the unsigned max) is not larger than the min of either input.
}

// If only one range is positive, the result will range from 0 up to that range's maximum value.
// For the operation 'x & C' where C is a positive constant, the result will be in the range [0..C]. With that observation,
// we can say that for any integer c such that 0 <= c <= C will also be in the range [0..C]. Therefore, 'x & [c..C]'
// where c >= 0 will be in the range [0..C].
if (r0->_lo >= 0) {
  return IntegerType::make(0, r0->_hi, widen); // r0 will have a single simple interval, and the result will be the union of 2 sets both of which have the max being not larger than r0->_hi
}

if (r1->_lo >= 0) {
  return IntegerType::make(0, r1->_hi, widen); // Same as above
}

// At this point, all positive ranges will have already been handled, so the only remaining cases will be negative ranges
// and constants.

assert(r0->_lo < 0 && r1->_lo < 0, "positive ranges should already be handled!");

// As two's complement means that both numbers will start with leading 1s, the lower bound of both ranges will contain
// the common leading 1s of both minimum values. In order to count them with count_leading_zeros, the bits are inverted.
NativeType sel_val = ~MIN2(r0->_lo, r1->_lo);

NativeType min; // This takes into consideration that the result is negative iff both the inputs are negative, then uses the lower bound to infer the leading 1s in that case
if (sel_val == 0) {
  // Since count_leading_zeros is undefined at 0, we short-circuit the condition where both ranges have a minimum of -1.
  min = -1;
} else {
  // To get the number of bits to shift, we count the leading 0-bits and then subtract one, as the sign bit is already set.
  int shift_bits = count_leading_zeros(sel_val) - 1;
  min = std::numeric_limits<NativeType>::min() >> shift_bits;
}

NativeType max;
if (r0->_hi < 0 && r1->_hi < 0) {
  // If both ranges are negative, then the same optimization as both positive ranges will apply, and the smaller hi
  // value will mask off any bits set by higher values.
  max = MIN2(r0->_hi, r1->_hi); // Both ranges are negative, then similar to when r0->_lo >= 0 && r1->_lo >= 0
} else {
  // In the case of ranges that cross zero, negative values can cause the higher order bits to be set, so the maximum
  // positive value can be as high as the larger hi value.
  max = MAX2(r0->_hi, r1->_hi); // Consider the union of the results when inferring from 4 combinations of simple intervals of the inputs. If both simple intervals are in the negative range, the result is negative. Otherwise, the result will be not larger than the upper bound of the simple interval in the non-negative range.
}

return IntegerType::make(min, max, widen);
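
The lower bound computed by the old code for the all-negative case can be restated as "keep the leading run of 1 bits of the smaller lower bound and clear the rest". Below is a minimal sketch of that restatement with an exhaustive check on small values. The function names are hypothetical, 32-bit two's-complement integers are assumed, and this is not the code from the PR.

```cpp
#include <cassert>
#include <cstdint>

// Lower bound for x & y when x >= lo0 and y >= lo1 with both lower bounds negative:
// the leading 1 bits of min(lo0, lo1), followed by zeros.
inline int32_t and_lower_bound(int32_t lo0, int32_t lo1) {
  assert(lo0 < 0 && lo1 < 0);
  const int32_t lo = lo0 < lo1 ? lo0 : lo1;
  uint32_t inv = ~static_cast<uint32_t>(lo);  // leading zeros of inv == leading ones of lo
  inv |= inv >> 1;
  inv |= inv >> 2;
  inv |= inv >> 4;
  inv |= inv >> 8;
  inv |= inv >> 16;                           // every bit at or below the highest 0 bit of lo
  return -static_cast<int32_t>(inv) - 1;      // same value as ~inv, computed without overflow
}

// Exhaustively checks x & y >= bound for lo0 <= x < limit and lo1 <= y < limit,
// over all negative lo0, lo1 in [-limit, -1].
inline bool and_lower_bound_holds(int32_t limit) {
  for (int32_t lo0 = -limit; lo0 < 0; lo0++) {
    for (int32_t lo1 = -limit; lo1 < 0; lo1++) {
      const int32_t bound = and_lower_bound(lo0, lo1);
      for (int32_t x = lo0; x < limit; x++) {
        for (int32_t y = lo1; y < limit; y++) {
          if ((x & y) < bound) return false;
        }
      }
    }
  }
  return true;
}
```

For example, with lower bounds -6 and -1 the sketch gives -8, which matches the old computation: `sel_val` is 5, `count_leading_zeros` gives 29 for a 32-bit value, and shifting the minimum value right by 28 yields -8.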

Comment on lines +357 to +358
// The unsigned value of the result of 'and' is always not greater than both of its inputs
// since there is no position at which the bit is 1 in the result and 0 in either input
Contributor

That does not sound correct.

We could have ranges 0..0b1000 for both. But then both values are 0b0010, and so the result is 0b0010, which is a 1 at a position where both uhi values had zeros.

I think you need to talk about leading zeros somehow.

Member Author

No, this is not about the range, but about the values in a single operation. That is, if z = x & y then z u<= x && z u<= y. It follows that the upper bound of z is not larger than the upper bounds of x and y.

Comment on lines +371 to +372
// The unsigned value of the result of 'or' is always not less than both of its inputs since
// there is no position at which the bit is 0 in the result and 1 in either input
Contributor

Same issue here as above

Member Author

Same here, if z = x | y then z u>= x && z u>= y. This means that the lower bound of z is not smaller than the lower bounds of x and y.
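
Both pointwise facts discussed in these two threads can be checked exhaustively on 8-bit values, and they lift to the range bounds directly. The sketch below is an illustration with hypothetical helper names, not code from the PR.

```cpp
#include <cstdint>

// True if, for all 8-bit x and y, x & y is unsigned-<= both inputs
// and x | y is unsigned->= both inputs.
inline bool pointwise_bounds_hold() {
  for (uint32_t x = 0; x < 256; x++) {
    for (uint32_t y = 0; y < 256; y++) {
      const uint32_t a = x & y;
      const uint32_t o = x | y;
      if (a > x || a > y) return false;
      if (o < x || o < y) return false;
    }
  }
  return true;
}

// Range consequence for 'and': z <= x <= uhi_x and z <= y <= uhi_y.
inline uint32_t and_unsigned_upper(uint32_t uhi_x, uint32_t uhi_y) {
  return uhi_x < uhi_y ? uhi_x : uhi_y;
}

// Range consequence for 'or': z >= x >= ulo_x and z >= y >= ulo_y.
inline uint32_t or_unsigned_lower(uint32_t ulo_x, uint32_t ulo_y) {
  return ulo_x > ulo_y ? ulo_x : ulo_y;
}
```

In the example raised for 'and', both ranges are 0..0b1000 and both values are 0b0010: the result 0b0010 sets a bit that is 0 in the upper bound, yet it is still below `and_unsigned_upper(0b1000, 0b1000)`, which is all the comment claims.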

@eme64
Contributor

eme64 commented Oct 15, 2025

Nice, thanks for all the updates. I responded to some of the points above.

@openjdk
openjdk bot commented Oct 15, 2025

@merykitty hotspot has been added to this pull request based on files touched in new commit(s).

@merykitty
Member Author

@eme64 I have removed the usage of std::unordered_map as well as added comments explaining the values of all_instances_size. Do you have any other concern?

Contributor

@eme64 eme64 left a comment

Nice, that looks better already. Just looked over the diff, now going to look at the whole patch again.

Contributor

@eme64 eme64 left a comment

Looks really good. I'll run some internal testing now...

Contributor

@eme64 eme64 left a comment

@merykitty Thanks for working on this! I'm especially happy with the extra gtest-ing that we are now able to do on the types. This optimization will be the entry point for many KnownBits optimizations, which is exciting!

This still needs a second thorough review though, since it is not trivial ;)

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Oct 29, 2025
@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Oct 30, 2025
Contributor

@eme64 eme64 left a comment

Still good :)

@openjdk
openjdk bot commented Nov 4, 2025

@merykitty this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout andorxor
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Nov 4, 2025
@openjdk openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Nov 4, 2025
@merykitty
Member Author

May I have a second review, please?

5 participants